Research suggests that the presence of a non-referent from the same category as
the referent interferes with anaphor resolution. In five experiments, the hypothesis that 
multiple non-referents would produce a cumulative interference effect (i.e., a fan effect) 
was examined. This hypothesis was supported in Experiments 1A and 1B, with subjects
being less accurate and slower to recognize referents (1A) and non-referents (1B) as the
number of potential referents increased from two to five. Surprisingly, increasing the number of
potential referents led to a decrease in anaphor reading times. The results of Experiments
2A and 2B replicated the probe-recognition results in a completely within-subjects design 
and ruled out the possibility that a speeded-reading strategy led to the fan-effect findings. 
The results of Experiment 3 provided evidence that subjects were resolving the anaphors. 
These results suggest that multiple non-referents do produce a cumulative interference 
effect; however, additional research is necessary to explore the effect on anaphor reading 
times. 
INTRODUCTION

Many theorists have argued that language comprehension pro- 
cesses can be explained in large part by appealing to general 
memory processes (e.g., Lewis, 1996; Gerrig and McKoon, 1998; 
Myers and O'Brien, 1998; Lewis and Vasishth, 2005; van den 
Broek et al., 2005); this hypothesis has been widely supported 
by empirical evidence. For example, general theories of memory 
processes have been shown to provide explanations for linguistic 
tasks such as establishing common ground between multiple par- 
ties (Horton and Gerrig, 2005) and resolving anaphors (O'Brien 
et al., 1990; Almor, 1999). Anaphor comprehension (often called 
anaphor resolution) in particular appears to rely heavily upon 
memory to determine co-reference between an anaphor and 
antecedent. Even within a sentence, limitations on working mem- 
ory capacity induce the need for retrieval of referents (McElree, 
2000). There are also instances, such as pronouns that refer to 
implicit referents (Greene et al., 1994) and bridging inferences 
(Garrod and Sanford, 1981), where anaphors are resolved even 
though the intended referent has not been explicitly mentioned. 
Such processes clearly rely on memory to produce an acceptable 
referent. Further evidence for the relationship between memory 
and anaphor resolution is provided by the findings that many 
factors affecting memory also affect anaphor resolution, includ- 
ing distance and elaboration (O'Brien et al., 1990), salience of the 
anaphor (Klin et al., 2004), salience of the referent (Foraker and 
McElree, 2007), and frequency (van Gompel and Majid, 2004). In
the research reported here, we focus on anaphor resolution across 
sentences. Nevertheless, models of retrieval processes both across 
(Myers and O'Brien, 1998) and within (e.g., Lewis and Vasishth, 
2005) sentences have many commonalities, which we highlight 
below. 



Of particular interest for the current research are studies 
that have examined the effects of multiple potential referents on 
anaphor resolution (e.g., Corbett and Chang, 1983; Corbett, 1984;
Mason, 1997; Levine et al., 2000; Wiley et al., 2001; Badecker
and Straub, 2002; Klin et al., 2004, 2006; Ditman et al., 2007;
Levine and Hagaman, 2008). In one of the first studies examin- 
ing the effect of multiple potential referents, Corbett found longer 
reading time for an anaphoric noun phrase (e.g., the frozen veg- 
etable) that included a category label when a text contained two 
members of that category (e.g., fresh corn and frozen asparagus) 
than when there was only a single category member (e.g., frozen 
asparagus). Badecker and Straub similarly found an increase in 
reading time shortly after subjects read reflexives when multi- 
ple gender-matched referents had been mentioned (e.g., John 
thought that Bill owed himself another opportunity to solve the 
problem). Levine et al. (see also Klin et al., 2004, 2006) found 
evidence suggesting that under some conditions anaphors (e.g., 
the dessert) appear not to be resolved at all when a text con- 
tains two potential referents from the same category (e.g., tart 
and cake), likely due to the increased difficulty in identifying a
unique referent. The increased difficulty in processing anaphors 
in these studies suggests that readers engage in additional process- 
ing when a distractor (i.e., a non-referent) is present. Presumably 
this occurs because both nouns are considered as potential referents,
a process that is initiated by simple memory matching and
that leads to retrieval-based interference. This explanation follows
straightforwardly from global memory models (e.g., Ratcliff,
1978; Gillund and Shiffrin, 1984; Hintzman, 1986), which assume
that stored memory representations that are related to a mem- 
ory cue are activated in parallel and to the degree that they 
share features with the memory cue. Somewhat surprisingly, this 
additional processing appears to occur regardless of disambiguating
material that should identify the proper referent, such as a
prenominal adjective like frozen or the grammatical constraints 
that govern interpretation of reflexives (e.g., Reinhart, 1983). The 
reliability and time course of distractor interference, especially for
within-sentence retrieval, are a matter of debate. Recent evidence is
consistent with a very early role for grammatical constraints in
retrieval. For example, Chow et al. (2014) were unable to replicate
Badecker and Straub's results, and they found evidence that
grammatical constraints prevent distractor interference (see also 
Dillon et al., 2013). Across sentence boundaries, some features, 
such as parallel structure (e.g., Josh criticized Paul. Then Marie
insulted him.), may play an early role in limiting referent search 
(Chambers and Smyth, 1998). Nevertheless, for definite noun- 
phrase anaphors like the dessert, reported findings suggest that 
retrieval processes rely on semantic matching between an anaphor 
and potential referents, with no evidence as yet indicating that 
there are grammatical constraints on this process. 

Whereas results like those from Badecker and Straub (2002), 
Corbett (1984), and Klin and colleagues (Levine et al., 2000;
Klin et al., 2004, 2006) illustrate indirectly that distractors are
considered during anaphor resolution, direct evidence that dis- 
tractors are activated during anaphor resolution comes from 
results reported by O'Brien et al. (1990). O'Brien et al. had sub- 
jects read passages with two potential antecedents (e.g., train and 
plane), which appeared early and late in a passage and were some- 
times described elaborately. At the end of a passage, a sentence 
(e.g., Mark's neighbor asked him how he had traveled to his parent's) 
appeared that required retrieval of only one of the antecedents. 
Following this sentence, subjects had to name aloud one of the 
potential antecedent nouns. Relative to a no-anaphor control 
condition, referent nouns were named more quickly, replicating 
findings that suggest that referents are activated by anaphor resolution
processes (e.g., Dell et al., 1983). Of perhaps greater interest
was the finding that non-referent concepts were also activated 
relative to a control condition, especially when they were elabo- 
rated and appeared in the late position in the passage, between the 
anaphor and the correct antecedent. These results are consistent 
with the hypothesis that an anaphor acts like any other mem- 
ory cue, activating related information in parallel. The finding 
that non-referent concepts were activated, especially when they 
occurred late and were elaborated, again fits very well with well- 
established findings from the memory literature that recency and 
elaboration lead to easier memory access. 

Taken together, these studies demonstrate that people consider 
multiple potential referents when resolving anaphors, and further,
that the resolution of the anaphor increases activation for
the referent. However, studies involving distractors have typically 
been limited to situations with a single distractor. Therefore, the 
effect of additional distractors remains an open empirical ques- 
tion. A yet-stronger case that general memory processes govern 
anaphor resolution can be made if there is a cumulative effect 
of additional distractors. Both Myers and O'Brien's (1998) reso- 
nance model and Lewis and Vasishth's (2005) implementation of 
ACT-R (e.g., Anderson, 2005) as a theory of memory-retrieval in 
sentence-processing make similar predictions about the effect of 
multiple distractors. The resonance model states that elements in
the mental representation resonate to signals from retrieval cues.
In the case of anaphor resolution, the retrieval cue is the anaphor 
and the resonating elements are related items in the mental rep- 
resentation. Critically, the signal (i.e., resonance strength) of any 
item in the representation is divided among receiving elements, 
and only a subset of the elements with the strongest signal enter 
working memory (WM). Thus, the strength of a referent will be 
reduced in the presence of related distractors, reducing the proba- 
bility that the correct referent will be selected into WM. Similarly, 
Lewis and Vasishth's model states that the activation that a chunk 
in memory will receive is reduced as there are more chunks in 
memory associated with a particular cue. Given the assumption 
that activation determines retrieval latency and the probability 
of the retrieval of a memory chunk, there should be greater dif- 
ficulty in retrieving the correct referent with every additional 
distractor. 
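
The shared prediction of the two models can be illustrated with a minimal sketch. It assumes the standard ACT-R form in which a chunk's activation from a single cue is A = B + W(S - ln(fan)) and retrieval latency is F * exp(-A). The function names and all parameter values (B, W, S, F) are arbitrary assumptions chosen for illustration; they are not values fitted by Lewis and Vasishth (2005) or used in the present experiments.

```python
import math

def activation(fan, base=0.0, weight=1.0, max_strength=2.0):
    """Activation of a chunk reached from one cue shared by `fan` chunks.

    ACT-R form: A = B + W * (S - ln(fan)). Parameter values are
    illustrative assumptions only.
    """
    return base + weight * (max_strength - math.log(fan))

def retrieval_latency(act, latency_factor=1.0):
    """ACT-R latency T = F * exp(-A); F is an assumed scaling constant."""
    return latency_factor * math.exp(-act)

# Two to five potential referents, as in the list sentences used here.
latencies = {n: retrieval_latency(activation(n)) for n in range(2, 6)}
for n, t in latencies.items():
    print(f"{n} potential referents: relative latency {t:.3f}")
```

Under these assumptions, each additional chunk associated with the cue lowers activation and lengthens predicted retrieval time, which is the cumulative pattern the fan-effect hypothesis expects.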

We can also draw on the memory literature to provide empir- 
ical guidance about the possible effects of multiple distractors. 
Specifically, research has shown that reading sentences that pair 
a person with multiple locations (or a location with multiple 
people) slows later recognition of the sentences (Anderson, 1974; 
Radvansky, 1998; Anderson and Reder, 1999). This result, known 
as the fan effect, is hypothesized to occur because of interference 
among competing associations in memory. Unlike the anaphor 
literature, which has focused on single distractors, the fan effect 
literature has explored situations with more than two associations 
and has demonstrated a cumulative effect, such that additional 
associations cause additional interference. 

In the original demonstration of the fan effect (Anderson, 
1974), subjects studied sentences in which a person was paired 
with a location (see 1-4 below). 

(1) A hippie is in the park.

(2) A hippie is in the church. 

(3) A policeman is in the park. 

(4) A sailor is in the park. 

Importantly, some people were associated with more than one 
location and some locations were associated with more than one 
person. For example, the sailor was associated only with the park 
(i.e., a fan of one), the hippie was associated with both the park 
and the church (i.e., a fan of two), and the park was associated
with the hippie, the policeman, and the sailor (i.e., a fan of
three). Thus, the nouns varied in the number of associations with 
other nouns. After the study phase, subjects read another set of 
sentences, some of which were the same as those studied pre- 
viously and some of which were novel pairings of people and 
locations that the subjects had not seen. For each sentence, sub- 
jects indicated whether it was the same as one they had read 
during the study phase or not. Consistent with the hypothesis that 
multiple associations interfere with one another, subjects were 
slower to recognize sentences with nouns that were associated 
with more nouns compared to sentences with nouns associated 
with fewer nouns. That is, subjects were slower to respond as the 
size of the noun's fan increased. 
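
As a small worked illustration of how fan size is counted, the sketch below tallies the number of distinct associates for each noun in example sentences (1)-(4). The function and variable names are hypothetical and serve only to make the definition concrete.

```python
from collections import defaultdict

# Study sentences (1)-(4) from the example, as (person, location) pairs.
study_pairs = [
    ("hippie", "park"),
    ("hippie", "church"),
    ("policeman", "park"),
    ("sailor", "park"),
]

def compute_fans(pairs):
    """Return the number of distinct associates (the fan) of each noun."""
    associates = defaultdict(set)
    for person, location in pairs:
        associates[person].add(location)
        associates[location].add(person)
    return {noun: len(others) for noun, others in associates.items()}

fans = compute_fans(study_pairs)
for noun, fan in sorted(fans.items()):
    print(f"{noun}: fan of {fan}")
```

The counts match the description in the text: the sailor has a fan of one, the hippie a fan of two, and the park a fan of three.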

If anaphor resolution relies on general memory processes, 
and increasing the number of associations with a noun increases 
interference, then we can predict that increasing the total number
of potential referents for an anaphor should also show a cumu- 
lative retrieval-interference effect (i.e., a fan effect). The present 
study tested this prediction across five experiments by exploring 
the effects of multiple distractors on anaphor resolution and the 
subsequent activation levels of referents and distractors. In par- 
ticular, we used a probe recognition task after anaphor sentences 
to measure the relative activation of an anaphoric referent when 
there were a variable number of distractors. We also used the 
probe task to measure activation of those distractors as a function 
of the number of distractors. Our results demonstrate evidence of 
a fan effect in anaphor resolution. 

EXPERIMENT 1A 

In Experiment 1A, subjects read pairs of sentences. The first
provided an antecedent and one or more distractors in a serial 
list, and the second included an anaphoric noun phrase that 
co-referred with the antecedent; these were followed by a probe 
recognition task that was used to measure the activation of the ref- 
erent concept (see Table 1 for a sample passage and Appendix A in 
Supplementary Materials for a full list of experimental passages). 
In particular, the first sentence ended with a list of two, three, 
four, or five potential referents from the same taxonomic category, 
and the second sentence referred with a disambiguating adjective 
and categorical anaphor to a single item mentioned in the list. 
Following each sentence-pair, subjects completed a probe recog- 
nition task to measure the activation level of the referent following 
the anaphor. For example, the first sentence in the example in 
Table 1 describes a person looking through a toolbox with a num- 
ber of tools in it. The last tool mentioned in the sentence, a saw, 
is the antecedent concept. The second sentence then describes the 
person fixing a table using the cutting tool. The latter noun phrase 
serves as an unambiguous reference to the entity introduced by 
the antecedent. After the second sentence was completed, the 
word saw was presented in an old-new recognition task, the correct
response for which is "old." We assume that reaction time and
accuracy in responding to the probes will reflect the ease or difficulty
the subjects have in selecting the correct referent (cf. Dell
et al., 1983; Levine et al., 2000) from the list of potential referents,
including the distractors and the referent.

Table 1 | Sample passage.

List sentence            Amelia's new table was wobbling, so she looked in her toolbox and found . . .
  Two-noun               ... a hammer and a saw. (all experiments)
  Three-noun             ... a screwdriver, a hammer, and a saw. (Experiments 1A and 1B only)
  Four-noun              ... a level, a screwdriver, a hammer, and a saw. (Experiments 1A and 1B only)
  Five-noun              ... a wrench, a level, a screwdriver, a hammer, and a saw. (all experiments)
Anaphor                  She fixed it with the cutting tool before it broke. (all experiments)
No anaphor               She fixed the table all by herself before it broke. (Experiment 3 only)
Referent                 SAW (Experiments 1A, 2A, 2B, and 3)
Distractor               HAMMER (Experiments 1B, 2A, and 2B)
Comprehension question   Did Amelia use the saw? (all experiments)

We hypothesized that increasing the number of distractors 
would lead activation from the anaphor to spread among the ref- 
erent and distractor concepts (Kintsch, 1988; Myers and O'Brien, 
1998; Lewis and Vasishth, 2005). It was expected that the spread 
of activation from the anaphor to all conceptually-related poten- 
tial referents would cause the referent to be less active following 
anaphor resolution as the number of distractors increased (i.e., 
a monotonic increasing trend in reaction time and decreasing 
trend in accuracy was expected), resulting in lower probe accu- 
racy and longer probe recognition times. Additionally, this spread 
of activation should interfere with the selection of the appropriate 
referent during anaphor resolution, thus slowing reading of the 
reference sentence, replicating several findings (e.g., Corbett and 
Chang, 1983; Corbett, 1984; Mason, 1997; Badecker and Straub, 
2002). Alternatively, it is possible that a backward, parallel-search 
process occurs such that the earlier-occurring distractors have 
little or no detectable impact on anaphor resolution (O'Brien, 
1987). A backward, serial, self-terminating search would also pre- 
dict no impact of early distractors on resolution of later referents. 
This latter strategy seems attractive especially in short passages 
with a list-like first sentence (cf. Townsend and Fific, 2004).

METHOD 

Subjects 

Ninety-five students enrolled in a general psychology course at the 
University of Arkansas participated in the experiment to partially 
fulfill a research requirement. All subjects were native-English 
speakers. Informed consent was obtained from all subjects in this 
and all subsequent experiments. 

Materials and design 

There were 31¹ experimental sentence-pairs that appeared in one
of four conditions (see Table 1). Each sentence-pair began with 
a list sentence that introduced a character by proper name (half 
stereotypically male, half stereotypically female) and ended in a 
list of either two, three, four, or five nouns from the same taxo- 
nomic category. The nouns were all single words, common, and 
were selected to be roughly equal in typicality as judged by the 
first author and several research assistants. Furthermore, each of 
the last two nouns in the list was able to be distinguished from 
the other nouns by means of an adjective (e.g., saws can be distin- 
guished from the other tools in the list using the adjective cutting). 
The list sentence was followed by a reference sentence that unam- 
biguously referred to the final item in the list using an adjective 
and a categorical anaphor (e.g., cutting tool) that was the same for 
all conditions. The anaphor always occurred three words prior to 
the end of the reference sentence to ensure that there was enough 
time for the anaphor to be resolved by the time the sentence was 
fully read (i.e., by the time the probe-word task was presented). 



¹ Experimenter error resulted in an odd number of experimental items in this
experiment and in Experiment 1B.
In addition, there were 68 filler sentence-pairs that each
included a list sentence with two to five nouns but that were not 
limited by the same restrictions on nouns in the experimental 
lists (e.g., the nouns could be proper or multiple words). As with 
the experimental sentence pairs, the filler reference sentences also 
included a categorical anaphor modified by an adjective; however, 
the referent of the anaphor was not always completely unambigu- 
ous. Moreover, the referent of the anaphor was not always the last 
item in the list. These two features of the fillers were expected to 
encourage subjects to put forth more effort in resolving anaphors 
across all trials. 

Each experimental and filler sentence-pair also had a corresponding
recognition probe and comprehension question.
Following the reference sentence, subjects completed a probe 
recognition task in which they indicated whether a word on the 
screen had occurred in the previous sentence-pair. For experi- 
mental sentence-pairs, the probe word was always the final noun 
from the list, which required a "yes" response. To ensure an equal 
number of "yes" and "no" responses across the experiment, the 
majority of the filler probe tasks presented a word that did not 
occur in the sentence pair and therefore required a "no" response. 
Other fillers presented a probe word that was not the final noun 
from the list, requiring a "yes." Finally, a comprehension ques- 
tion was presented following the probe recognition task, half of 
which required a "yes" response and half of which required a "no" 
response. Comprehension questions frequently, but not always, 
focused on correct resolution of the anaphor (e.g., Did Amelia use
the saw?). 

Subjects saw each experimental sentence-pair in one of the 
four conditions along with all filler sentence-pairs. Four coun- 
terbalanced lists were created with the following constraints: one 
quarter of the list sentences had two nouns, one quarter had three 
nouns, one quarter had four nouns, and one quarter had five 
nouns. Furthermore, a second set of materials² was created that
reversed the order of the final two nouns in the list, such that the final
noun in the first set of materials (e.g., saw) became the penul- 
timate noun and the formerly penultimate noun (e.g., hammer) 
became the final noun. This also required a change in the disam- 
biguating adjective in the reference sentence (e.g., cutting changed 
to pounding) such that the referent of the categorical anaphor was 
always the final noun. The manipulation of these factors resulted 
in a design that was 4 (nouns: 2, 3, 4, 5) x 2 (noun order: order 1, 
order 2). 
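
One common way to build counterbalanced lists of this kind is a Latin-square rotation of conditions across items. The sketch below is an assumption about how such lists could be constructed, not a description of the procedure actually used; the function name and the item count of 32 (chosen because it divides evenly by four) are hypothetical.

```python
NOUN_CONDITIONS = [2, 3, 4, 5]  # number of nouns in the list sentence

def build_lists(n_items, conditions=NOUN_CONDITIONS):
    """Rotate conditions across items so that each item appears in every
    condition exactly once across the counterbalanced lists."""
    n_lists = len(conditions)
    return [
        {item: conditions[(item + k) % n_lists] for item in range(n_items)}
        for k in range(n_lists)
    ]

# Hypothetical, evenly divisible item count (assumption for illustration).
lists = build_lists(n_items=32)
for k, assignment in enumerate(lists, start=1):
    counts = {c: list(assignment.values()).count(c) for c in NOUN_CONDITIONS}
    print(f"List {k}: items per condition = {counts}")
```

With an item count that is a multiple of four, each list contains one quarter of the items in each noun condition, and each subject sees every item in exactly one condition.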

Procedure 

The experiment began with three practice blocks of five trials 
each, which were intended to familiarize the subject with the 
yes/no response keys, the probe recognition task, and the com- 
prehension question, respectively. For all practice trials, feedback 
about the correctness of subjects' responses was provided. 

Subjects then began the experimental session. Subjects were 
instructed to read the sentences as they normally would for 



² Probe length and frequency were similar for the two sets of materials (length:
Set 1 M = 6.5 letters, SD = 1.5; Set 2 M = 6.4 letters, SD = 1.7; log frequency
(Lund and Burgess, 1996; Balota et al., 2007): Set 1 M = 7.4, SD = 2.4; Set 2
M = 7.2, SD = 2.9).



comprehension and to respond to the probe words as quickly and 
accurately as possible. Each trial consisted of a list sentence, a ref- 
erence sentence, a probe word, and a comprehension question. 
At the beginning of each trial, subjects were given the instruc- 
tion "PRESS THE SPACEBAR WHEN READY" centered on a 
computer monitor. When they pressed the spacebar, the list sen- 
tence appeared left-justified in the middle of the screen. Subjects 
pressed the spacebar to indicate when they had finished reading 
the list sentence, which removed the list sentence from the screen 
and replaced it with the reference sentence. Subjects pressed the 
spacebar again to indicate when they had finished reading the ref- 
erence sentence, which removed the reference sentence from the 
screen and replaced it with a probe word in all capital letters in the 
center of the screen. Subjects used the left and right arrow keys 
labeled "Y" and "N" for yes and no, respectively, to respond to the 
probe task. This removed the probe word and replaced it with a 
comprehension question in the center of the screen; no feedback 
about correctness was provided for probes or questions. Subjects 
again used the yes and no keys to respond to the comprehension 
question, which ended the trial. 

The experimental session consisted of 99 trials (31 experi- 
mental and 68 fillers) in three blocks of 25 trials and one block 
of 24 trials. The order of the blocks, as well as the order of 
the trials within each block, was randomized with the restric- 
tion that the first sentence-pair of each block was always a filler 
sentence-pair, to allow time for the subjects to fully return their 
attention to the task after a mandatory 10 s break between blocks. 
Subjects were free to take breaks between trials. The experiment 
lasted approximately 30 min. The procedures for this and all subsequent
experiments were approved by the University of Arkansas
Institutional Review Board.

RESULTS 

Data exclusion and general analytic considerations 

A subject's data were excluded from further analysis if they met 
any of the following criteria: (1) they had more than 30% of reading
times less than 1000 ms or greater than 7500 ms; (2) they had
lower than 70% probe recognition accuracy; (3) they had more 
than 30% of probe reaction times less than 500 ms or greater 
than 2500 ms; (4) they had no non-outlying probe recognition 
observations in at least one condition; or (5) they had less than 
70% comprehension question accuracy. Based on these criteria, 
the data from eight subjects were excluded from further analysis. 
Additionally, two experimental items were removed from further 
analysis due to counterbalancing errors. Therefore, the reported 
analyses include 85 subjects and 29 items. 
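
The five subject-level criteria can be expressed as a single check, sketched below. The trial representation and key names are hypothetical. Two simplifications are assumptions: a single reading time per trial is used for criterion (1), and "non-outlying" in criterion (4) is approximated with the same 500-2500 ms window as criterion (3).

```python
def in_rt_window(trial, low=500, high=2500):
    """True if the probe reaction time falls inside the accepted window."""
    return low <= trial["probe_rt_ms"] <= high

def should_exclude(trials, conditions=(2, 3, 4, 5)):
    """Apply the five exclusion criteria to one subject's trials.

    Each trial is a dict with hypothetical keys: 'condition', 'read_ms',
    'probe_rt_ms', 'probe_correct', and 'comp_correct'.
    """
    n = len(trials)
    bad_read = sum(not (1000 <= t["read_ms"] <= 7500) for t in trials) / n
    probe_acc = sum(t["probe_correct"] for t in trials) / n
    bad_rt = sum(not in_rt_window(t) for t in trials) / n
    comp_acc = sum(t["comp_correct"] for t in trials) / n
    # Criterion 4: at least one condition has no usable probe observation.
    empty_cell = any(
        not any(t["condition"] == c and in_rt_window(t) for t in trials)
        for c in conditions
    )
    return (bad_read > 0.30 or probe_acc < 0.70 or bad_rt > 0.30
            or empty_cell or comp_acc < 0.70)
```

A subject is dropped as soon as any one criterion is met, which mirrors the "any of the following" wording of the rule.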

For all experiments reported in this paper, subject and item 
condition means were analyzed separately; a subscript of 1 indi- 
cates that subjects were treated as a random-effects variable, 
whereas a subscript of 2 indicates that items were treated as 
a random-effects variable. For all significance tests, an alpha 
level of 0.05 was used. Predictions about monotonic increasing 
and decreasing trends were tested using polynomial contrasts. 
For all repeated-measures effects with more than one numera- 
tor df, Huynh-Feldt adjusted p-values are reported to correct for 
sphericity violations. Effect-size measures that are reported are 
based on the subject analyses, and all within-subject standard 
errors in figures and tables were computed using the method
recommended by Loftus and Masson (1994). 
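
For readers unfamiliar with polynomial contrasts, the sketch below shows one standard way to test a linear trend across four equally spaced within-subject conditions: weight each subject's condition means by (-3, -1, 1, 3), run a one-sample t-test on the resulting contrast scores, and square t to obtain F(1, n - 1). This is an illustration of the general technique, not the authors' analysis script, and the example data are invented.

```python
from math import sqrt
from statistics import mean, stdev

# Orthogonal polynomial weights for the linear trend, four equally spaced levels.
LINEAR_WEIGHTS = (-3, -1, 1, 3)

def linear_trend_f(subject_means):
    """Return (F, (df1, df2)) for the linear trend in a repeated-measures design.

    `subject_means` has one row per subject with the four condition means
    (two-, three-, four-, and five-noun).
    """
    scores = [sum(w * m for w, m in zip(LINEAR_WEIGHTS, row))
              for row in subject_means]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t ** 2, (1, n - 1)

# Invented probe RTs (ms) for three hypothetical subjects.
example = [
    [900, 920, 950, 990],
    [850, 860, 900, 910],
    [1000, 1010, 1005, 1060],
]
f_value, df = linear_trend_f(example)
print(f"F{df} = {f_value:.2f}")
```

The same function applied to item means rather than subject means would give the corresponding items analysis, assuming items are fully crossed with the noun conditions.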

Comprehension 

In general, the number of nouns did not affect comprehension
(see Table 2 for comprehension results across all experiments).
The linear trend was non-significant, $F_1(1, 84) = 0.34$,
$p = 0.56$, $F_2(1, 56) = 0.07$, $p = 0.79$, with no significant higher-order
trends. (See Appendix C in Supplementary Materials for
the results of the noun-order factor in this experiment and
Experiment 1B.)

Probe accuracy 

Figure 1 presents mean probe word accuracy and reaction times 
along with mean reference-sentence reading times as a function 
of the number of referents. In general, accuracy decreased as 
the number of nouns in the list sentence increased. The linear
trend was significant, $F_1(1, 84) = 9.63$, $p = 0.003$, $F_2(1, 28) = 9.99$,
$p = 0.004$, $\eta_p^2 = 0.10$, with no significant higher-order trends.

Probe reaction times (RT)

Only correct probes were analyzed. Outliers were first classi- 
fied as RTs that were less than 400 ms or greater than 3000 ms. 



Table 2 | Mean comprehension for all experiments (with standard
errors of the mean).

                 Two-noun       Three-noun     Four-noun      Five-noun
Experiment 1A    0.88 (0.014)   0.86 (0.014)   0.87 (0.014)   0.87 (0.014)
Experiment 1B    0.89 (0.014)   0.87 (0.015)   0.83 (0.017)   0.86 (0.016)
Experiment 2A
  Referent       0.95 (0.012)                                 0.92 (0.015)
  Distractor     0.93 (0.012)                                 0.91 (0.014)
Experiment 2B
  Referent       0.93 (0.013)                                 0.88 (0.021)
  Distractor     0.92 (0.015)                                 0.91 (0.015)
Experiment 3
  Anaphor        0.88 (0.019)                                 0.90 (0.014)
  No anaphor     0.96 (0.011)                                 0.91 (0.015)





Remaining reaction times more extreme than 1.5 times the 
interquartile range above the 75th percentile or below the 25th 
percentile for each subject were classified as outliers (Tukey, 
1977), resulting in 8.6% of the data being excluded from fur- 
ther analyses. In general, reaction time increased as the number 
of nouns in the list sentence increased (see Figure 1). The linear 
trend was significant, $F_1(1, 84) = 8.03$, $p = 0.006$, $F_2(1, 28) = 6.68$,
$p = 0.02$, $\eta_p^2 = 0.09$, with no significant higher-order trends.
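
The two-stage outlier procedure described above can be sketched as follows for a single subject's correct-probe RTs. The function name is hypothetical, and the quartile definition (here the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not specify how the 25th and 75th percentiles were computed.

```python
from statistics import quantiles

def tukey_trim(rts, lower=400, upper=3000, k=1.5):
    """Two-stage trimming of one subject's probe RTs (in ms).

    Stage 1 drops RTs outside the absolute cutoffs. Stage 2 drops RTs more
    than k * IQR below the 25th or above the 75th percentile (Tukey, 1977).
    """
    kept = [rt for rt in rts if lower <= rt <= upper]
    q1, _, q3 = quantiles(kept, n=4, method="inclusive")
    iqr = q3 - q1
    return [rt for rt in kept if q1 - k * iqr <= rt <= q3 + k * iqr]

# Invented RTs for one hypothetical subject.
example_rts = [350, 800, 820, 850, 870, 900, 910, 950, 2400, 3500]
trimmed = tukey_trim(example_rts)
print(trimmed)
```

In this invented example the absolute cutoffs remove 350 and 3500 ms, and the interquartile-range fences then remove 2400 ms.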

Reference-sentence reading times 

Reference-sentence reading times were transformed to per- 
character reading times by dividing the full-sentence reading time 
by the number of characters in the sentence, not counting spaces 
and punctuation (see Table 3). Outliers were first identified as trials
with less than 15 ms/char or more than 150 ms/char. Outliers
among the remaining reading times were then identified within
each subject based on Tukey's (1977) criteria. In total, 7.6% of the trials
were excluded from further analysis. In general, reading time on
the reference sentence decreased as the number of nouns in the
list sentence increased. The linear trend was significant,
$F_1(1, 84) = 19.55$, $p < 0.001$, $F_2(1, 30) = 10.87$, $p = 0.003$, $\eta_p^2 = 0.19$, with no
significant higher-order trends.
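
The per-character transformation can be made concrete with a short sketch. The function name is hypothetical, punctuation is assumed to mean ASCII punctuation, and the 2378 ms reading time is an invented value used only to show the arithmetic.

```python
import string

def per_char_reading_time(sentence, reading_time_ms):
    """Full-sentence reading time divided by the number of characters,
    not counting spaces and (ASCII) punctuation."""
    n_chars = sum(
        1 for ch in sentence
        if not ch.isspace() and ch not in string.punctuation
    )
    return reading_time_ms / n_chars

# Reference sentence from Table 1 with an invented reading time.
sentence = "She fixed it with the cutting tool before it broke."
ms_per_char = per_char_reading_time(sentence, 2378)
print(f"{ms_per_char:.1f} ms/char")
```

The sample sentence has 41 countable characters, so a 2378 ms reading time corresponds to 58.0 ms/char, which falls inside the 15 to 150 ms/char window applied before the Tukey step.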

DISCUSSION 

The results of the probe word analyses were consistent with the 
fan-effect hypothesis and generally favor models of anaphor res- 
olution that posit a parallel-search mechanism in retrieval of the 
correct referent. As predicted, the presence of distractors inter- 
fered with the probe recognition task. Increasing the number of 
distractors in the list sentence decreased recognition accuracy and 
increased reaction times for referents, which suggests that the 
activation level of referents decreased as the number of distrac- 
tors increased. The existing literature has shown via a variety of 
measures and paradigms that the presence of one distractor inter- 
feres with anaphor resolution (e.g., Corbett and Chang, 1983; 
Corbett, 1984; Mason, 1997; Levine et al., 2000; Wiley et al., 2001;
Klin et al., 2004, 2006; Ditman et al., 2007; Levine and Hagaman,
2008). The present results extend this finding by demonstrating a 
cumulative effect of distractors. 

The effect of additional nouns on the subsequent reference- 
sentence reading times, however, was unexpected. It was pre- 
dicted, based on previous research (e.g., Corbett, 1984), that 
anaphor resolution would be slowed by the presence of dis- 
tractors, resulting in longer full-sentence reading times as the 
number of distractors increased. However, the results were exactly 
the opposite, indicating that the subjects actually read the ref- 
erence sentences more quickly as the number of distractors 



Table 3 | Experiment 1A mean per-character reading times in ms
(with standard errors of the mean).

                     Two-noun     Three-noun   Four-noun    Five-noun
List sentence        73.2 (0.9)   74.0 (0.9)   76.8 (0.9)   77.4 (0.9)
Reference sentence   58.3 (0.7)   57.5 (0.7)   55.4 (0.7)   53.9 (0.7)

FIGURE 1 | Experiment 1A antecedent probe reaction times and
accuracies by noun condition (error bars indicate SE of the mean).

increased. Assuming this is not a Type I error, one plausible
explanation for this result is that subjects adopted a strategy 
of speeding through the reference sentence to reduce the time 
between the referents and the probe recognition task when there 
were more distractors. A similar finding was reported by Van 
Dyke and McElree (2006), who had subjects read sentences
of variable complexity while holding or not holding a mem- 
ory load and found that reading was faster for more-complex 
sentences with a memory load than without one. This speeded- 
reading strategy as a potential alternative explanation for the 
fan effect was explored in further detail in Experiments 2A 
and 2B; we defer discussion until the presentation of those 
experiments. 

EXPERIMENT 1B

Experiment 1A established that referents were less active following
anaphor resolution when there were more potential referents
available in the discourse. Experiment 1B replicated Experiment
1A but used distractors as the probe words to test the effect of
multiple distractors on the activation level of a distractor. As
in Experiment 1A, it was hypothesized that additional distractors
would decrease probe accuracy and slow probe recognition
times. If anaphors act like any other cue to memory, the acti- 
vation of the referent and distractors should be split (Kintsch, 
1988; Myers and O'Brien, 1998; Lewis and Vasishth, 2005), 
resulting in less activation to go around (i.e., a fan effect) as 
there are more related concepts in the list sentence. Because 
the anaphor contains two cues (i.e., adjective plus noun) to 
retrieve the referent but only one (i.e., the noun) that matches 
the distractors, referents should become more active and expe- 
rience less interference (i.e., a reduced fan effect) than distrac- 
tors following anaphor resolution. Moreover, later items may 
overwrite or displace earlier items, leading to degraded repre- 
sentations of the referent and especially earlier-occurring dis- 
tractors (Nairne, 1990; Lewis, 1996). We examine these pre- 
dictions in a cross-experiment comparison after presenting the 
results of Experiment IB, and then examine them more directly 
(i.e., in a completely within-subjects design) in Experiments 
2A and 2B. 



METHOD

Subjects

Seventy-eight students enrolled in a general psychology course
at the University of Arkansas participated in the experiment
to partially fulfill a research requirement. All subjects were
native-English speakers.

Materials, design, and procedure

Experiment 1B was identical to Experiment 1A except that the
probe words in the probe recognition task for experimental trials
were distractors (i.e., the penultimate word in the list).

RESULTS

Data exclusion and general analytic considerations

Based on the data exclusion criteria detailed in Experiment 1A,
the data from eight subjects were excluded from further analysis.
Sixteen more subjects were removed from further analysis for a
systematic misunderstanding of the instructions. These subjects
consistently responded "no" to distractors on the probe task when
they should have been responding "yes." This pattern of responding
suggests that these subjects were correctly identifying the correct
referent of the anaphor, but misunderstanding that this
identification was unrelated to the probe task. Therefore, the
comprehension accuracy, probe accuracy, and reading time analyses
included 54 subjects and 31 items.

Comprehension

In general, comprehension (see Table 2) decreased as the number of
nouns increased. The linear trend was significant in the subject
analysis, $F_1(1, 53) = 5.36$, $p = 0.025$, $\eta_p^2 = 0.09$, but
non-significant in the items analysis, $F_2(1, 30) = 2.91$,
$p = 0.093$, with no significant higher-order trends.

Probe accuracy

Figure 2 presents mean probe word accuracy and reaction times
along with mean reference-sentence reading times as a function of
the number of referents. In general, accuracy decreased as the
number of nouns in the list sentence increased. The linear trend was
significant, $F_1(1, 53) = 39.08$, $p < 0.001$, $F_2(1, 30) = 45.28$,
$p < 0.001$, $\eta_p^2 = 0.42$, with no significant higher-order
trends.

Probe reaction times

Based on outlier exclusion criteria, 9.6% of the data were excluded
from further analyses. In general, reaction time increased as the
number of nouns in the list sentence increased (see Figure 2).
The linear trend was significant, $F_1(1, 53) = 16.79$, $p < 0.001$,
$F_2(1, 30) = 6.59$, $p = 0.01$, $\eta_p^2 = 0.24$. There was also an
unexpected cubic trend, $F_1(1, 53) = 12.81$, $p = 0.001$,
$F_2(1, 30) = 3.13$, $p = 0.09$. There was no theoretical expectation
of this effect, and it did not appear in Experiment 1A, so we did not
try to interpret it.

Reference-sentence reading times

Based on outlier exclusion criteria, 5.2% of the data were excluded
from further analyses. In general, as in Experiment 1A, reading time
(see Table 4) on the reference sentence decreased as the number of
nouns in the list sentence increased. The linear trend was
significant, $F_1(1, 53) = 11.74$, $p = 0.001$, $F_2(1, 30) = 11.52$,
$p = 0.002$, $\eta_p^2 = 0.18$, with no significant higher-order
trends.

FIGURE 2 | Experiment 1B distractor probe reaction times and
accuracies by noun condition (error bars indicate SE of the mean).

Table 4 | Experiment 1B mean per-character reading times in ms (with
standard errors of the mean).

              List sentence   Reference sentence
Two-noun      77.0 (1.11)     60.9 (0.88)
Three-noun    80.3 (1.11)     59.8 (0.88)
Four-noun     80.3 (1.11)     58.3 (0.88)
Five-noun     82.5 (1.11)     57.4 (0.88)

DISCUSSION 

The probe word results were again consistent with the fan-effect 
hypothesis. As predicted, the presence of distractors interfered 
with the probe recognition task. Increasing the number of ref- 
erents in the list sentence decreased recognition accuracy and 
increased reaction times for distractors, similar to the effect found
for referents in Experiment 1A. This result extends the findings of
Experiment 1A to show that distractors also decrease in activation
as the number of referents increases.

As in Experiment 1A, the reading-time results did not support
the fan-effect hypothesis. Subjects again read the reference sentence
more quickly as the number of distractors increased. This
replication provides additional confidence that the unexpected 
results were not occurring due to chance. This issue was explored 
in further detail in Experiments 2A and 2B. 

EXPERIMENTS 1A AND 1B COMBINED ANALYSIS

As noted in the introduction to Experiment 1B, the effect
of fan size should be different for referents (Experiment 1A)
and distractors (Experiment 1B). To compare the magnitude
of the effect of the number of nouns on referents and distractors,
an additional analysis was conducted for the probe reaction times
from Experiments 1A and 1B. Probe reaction times for each subject
in both experiments were first linearly regressed on the number of
nouns (cf. Lorch and Myers, 1990), and the slopes were then examined
in an independent-samples t-test with experiment (i.e., probe:
referent vs. distractor) as a between-subjects variable. This
analysis revealed a non-significant effect of probe in the expected
direction, with a substantially smaller mean slope among subjects
responding to referents in Experiment 1A ($M_{\text{slope}} = 15.2$
ms/noun, $SE = 5.4$) than among subjects responding to distractors in
Experiment 1B ($M_{\text{slope}} = 31.3$ ms/noun, $SE = 7.6$),
$t(137) = 1.77$, $p = 0.08$, $d = 0.30$.

A similar analysis performed on the accuracy data revealed
a large and significant effect of probe, with a substantially
smaller mean slope among subjects responding to referents in
Experiment 1A ($M_{\text{slope}} = -0.014$ accuracy/noun,
$SE = 0.0046$) than among subjects responding to distractors in
Experiment 1B ($M_{\text{slope}} = -0.052$ accuracy/noun,
$SE = 0.0084$), $t(137) = 4.33$, $p < 0.001$, $d = 0.74$. Although
referents likely gained an advantage in both accuracy and speed of
responding due to having appeared more recently than distractors,
these analyses focused on the linear trends, in which distance from
the probe was equal. Therefore, these results provide evidence that
the interference effect is greater for distractors than referents;
this effect was tested more directly in Experiments 2A and 2B.
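
The two-step slope analysis described above can be sketched as follows. This is a minimal illustration only: the function names and the per-subject reaction times are hypothetical, and the pooled-variance form of the t statistic is an assumption, since the text specifies only an independent-samples t-test on per-subject slopes.

```python
from math import sqrt
from statistics import mean, variance


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def subject_slopes(rts_by_subject, noun_counts=(2, 3, 4, 5)):
    """One slope (ms per noun) per subject, from that subject's condition means."""
    return [ols_slope(noun_counts, rts) for rts in rts_by_subject]


def independent_t(group_a, group_b):
    """Pooled-variance t statistic (group_b minus group_a) and its df."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    se = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_b) - mean(group_a)) / se, na + nb - 2


# Hypothetical mean RTs (ms) per subject for 2, 3, 4, and 5 nouns.
# These numbers are invented for illustration; they are not the study's data.
referent_rts = [[1500, 1510, 1530, 1540], [1480, 1500, 1505, 1530], [1600, 1610, 1640, 1650]]
distractor_rts = [[1550, 1580, 1610, 1650], [1500, 1540, 1560, 1600], [1620, 1640, 1690, 1700]]

t, df = independent_t(subject_slopes(referent_rts), subject_slopes(distractor_rts))
print(f"t({df}) = {t:.2f}")
```

Reducing each subject to a single slope before the group comparison keeps the subject as the unit of analysis, which is the point of the Lorch and Myers (1990) approach cited above.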

EXPERIMENT 2A 

The procedure for Experiment 2A was modified from that in
Experiments 1A and 1B such that subjects read the reference
sentence one word at a time. This allowed for a more detailed
analysis of the reading times, which was necessary to help
understand the unexpected reference-sentence reading time results of
Experiments 1A and 1B. The prediction that additional distractors
should slow reading of the reference sentence was based
on the hypothesis that multiple distractors would interfere with 
anaphor resolution. This means that the expected slowdown 
should occur specifically on the anaphor or immediately after the 
anaphor during spillover processing. According to this hypothe- 
sis, it was expected that there should be no difference in reading 
times on the reference-sentence until subjects reach the anaphor 
and post-anaphor regions, where they were expected to read more 
slowly as the number of distractors increased. However, if the
results of Experiments 1A and 1B are reliable, then there should
be shorter reading times when there are more distractors at some
point in the reference sentence prior to the anaphor.

In addition, Experiments 1A and 1B demonstrated that the
presence of multiple distractors made recognition of both referents
and distractors more difficult, as indexed by both reaction
time and accuracy. Experiments 2A and 2B were designed to
manipulate the probe word within subjects to address potential
concerns about comparing results across experiments. Thus,
in these experiments, probe word (referent vs. distractor) and 
number of distractors (two vs. five) were manipulated within sub- 
jects. The fan-effect hypothesis predicts that additional distractors 
would slow recognition and decrease accuracy for both referents 
and distractors. Moreover, to the extent that anaphor resolution 
focuses activation on the referent, thereby minimizing interfer- 
ence, the degree of interference should be greater for distractors 
than for referents. 

METHOD 

Subjects 

Seventy-five students enrolled in a general psychology course 
at the University of Arkansas participated in the experiment to 
partially fulfill a research requirement. All subjects were
native-English speakers.

Materials and design 

Thirty of the experimental materials from Experiment 1 were 
used and appeared in only the two- and five-noun list conditions. 
This also required some modification of the list length in the filler 
sentences to maintain an equal distribution of list lengths across 
the entire experiment. In addition, the probe words were manipu- 
lated within subjects, such that each subject saw an equal number 
of referent and distractor probes following experimental items. 

Subjects saw each experimental sentence pair in one of
the four conditions along with all filler sentence pairs. Four
counterbalanced lists were created with the following constraints:
approximately one quarter of the list sentences (i.e., 7 or 8 items)
had two nouns followed by a referent probe, approximately one
quarter had two nouns followed by a distractor probe, approximately
one quarter had five nouns followed by a referent probe,
and approximately one quarter had five nouns followed by a
distractor probe. Because counterbalancing order did not have any
important effects in Experiments 1A and 1B, order was no longer
manipulated, resulting in a 2 (nouns: 2, 5) x 2 (probe word:
referent, distractor) completely within-subjects design.
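
One way to build four lists satisfying these constraints is a simple rotation of conditions across items, sketched below. The rotation scheme and all names are assumptions for illustration; the text does not say how the lists were actually constructed.

```python
from collections import Counter
from itertools import product

NOUNS = (2, 5)
PROBES = ("referent", "distractor")
CONDITIONS = list(product(NOUNS, PROBES))  # the four cells of the 2 x 2 design


def build_lists(n_items=30):
    """Rotate conditions across items so that each item appears once per list
    and in a different condition on each of the four lists (assumed scheme)."""
    return [
        {item: CONDITIONS[(item + shift) % len(CONDITIONS)] for item in range(n_items)}
        for shift in range(len(CONDITIONS))
    ]


lists = build_lists()
for number, assignment in enumerate(lists, start=1):
    # Show how many of the 30 items fall in each condition on this list.
    print(number, Counter(assignment.values()))
```

Because 30 items do not divide evenly by four conditions, each list necessarily has two conditions with 8 items and two with 7, which matches the "7 or 8 items" noted above.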

Procedure 

The experiment was conducted using Linger (Rohde, 2003) to 
present the materials using a moving window (Just et al., 1982). 
Before starting the experiment, subjects completed three prac- 
tice trials to familiarize themselves with the procedure. Each trial 
began with two rows of dashes, centered on the left-hand side 
of the screen, with each dash replacing a character or space in 
the sentences. Subjects pressed the spacebar to initially present 
the list sentence in its entirety. When they finished reading the 
list sentence, subjects pressed the spacebar again, which replaced
the list sentence with dashes and revealed the first word of the 
reference sentence. Subjects continued to press the spacebar to 
advance from one word to the next, with each press replacing 
the previous word with dashes and revealing the next word in 
the sentence. Pressing the spacebar after the final word of the ref- 
erence sentence removed all of the dashes from the screen and 
presented a probe word in all capital letters in the center of the 
screen. Subjects responded to the probe word using the F key for 
yes and the J key for no. The response removed the probe word 
from the screen and replaced it with a comprehension question. 
Subjects again responded using the F and J keys, which advanced 
the screen to the next trial. 

The experimental session consisted of 98 trials (30 experimen- 
tal and 68 fillers) in two blocks of 49 trials each with the order of 
the trials completely randomized. Subjects were instructed to read 
the sentences as they normally would for comprehension and to 
respond to the probe words as quickly and accurately as possible. 
Subjects were free to take breaks between trials. The experiment 
lasted approximately 30 min. 

RESULTS 

Data exclusion and general analytic considerations 

Based on the data exclusion criteria, the data from six sub- 
jects were excluded from further analysis. Therefore, the reported 
analyses included 69 subjects and 30 items. 

Comprehension 

In general, comprehension (see Table 2) decreased as the number
of nouns increased. A 2 (nouns: 2, 5) x 2 (noun probed:
referent, distractor) repeated-measures ANOVA revealed a main
effect of nouns that was non-significant in the subject analysis,
$F_1(1, 68) = 3.38$, $p = 0.07$, $\eta_p^2 = 0.05$, but significant
in the items analysis, $F_2(1, 29) = 4.82$, $p = 0.04$. The main
effect of noun probed was non-significant, $F_1(1, 68) = 2.62$,
$p = 0.11$, $F_2(1, 29) = 1.07$, $p = 0.31$, and the interaction
between number of nouns and noun probed was also non-significant,
$F_1(1, 68) = 0.01$, $p = 0.92$, $F_2(1, 29) = 0.14$, $p = 0.71$.



Probe accuracy 

Table 5 presents mean accuracy and probe reaction times as
a function of the number of nouns and the noun probed. In
general, accuracy was higher for referents than for distractors
and when there were two nouns in the list sentence than when
there were five. A 2 (nouns: 2, 5) x 2 (noun probed: referent,
distractor) repeated-measures ANOVA revealed a significant
main effect of the number of nouns, $F_1(1, 68) = 28.34$,
$p < 0.001$, $F_2(1, 29) = 25.46$, $p < 0.001$, $\eta_p^2 = 0.29$,
as well as a significant main effect of the noun probed,
$F_1(1, 68) = 17.62$, $p < 0.001$, $F_2(1, 29) = 28.98$,
$p < 0.001$, $\eta_p^2 = 0.21$. There was also a significant
interaction between the number of nouns in the sentence
and the noun being probed, $F_1(1, 68) = 4.51$, $p = 0.04$,
$F_2(1, 29) = 4.37$, $p = 0.05$, $\eta_p^2 = 0.06$, with a greater
2- vs. 5-noun difference for distractors than for referents,
replicating the effect seen in the between-experiments comparison
presented above. Planned pairwise comparisons revealed a significant
effect of the number of nouns for both the referent probes,
$t_1(68) = 3.04$, $p = 0.003$, $t_2(29) = 3.69$, $p = 0.001$,
$d = 0.37$, and the distractor probes, $t_1(68) = 4.53$,
$p < 0.001$, $t_2(29) = 4.17$, $p < 0.001$, $d = 0.55$.

Table 5 | Experiments 2A and 2B mean probe word responses (with
standard errors of the mean).

                 Accuracy                       Reaction time (ms)
                 Referent       Distractor      Referent      Distractor
Experiment 2A
  Two-noun       0.97 (0.013)   0.92 (0.013)    1553 (25.1)   1632 (25.1)
  Five-noun      0.93 (0.013)   0.83 (0.013)    1599 (25.1)   1705 (25.1)
Experiment 2B
  Two-noun       0.95 (0.017)   0.92 (0.019)    1151 (21.5)   1273 (21.5)
  Five-noun      0.94 (0.013)   0.72 (0.022)    1245 (21.5)   1403 (21.5)

Probe reaction times

Based on outlier exclusion criteria, 7.8% of the data were excluded
from further analyses. Like the accuracy results, reaction time
tended to be shorter for referents than for distractors and when
there were two nouns in the list sentence than when there were
five. A 2 (nouns: 2, 5) x 2 (noun probed: referent, distractor)
repeated-measures ANOVA revealed a significant main effect of
the number of nouns, $F_1(1, 68) = 4.20$, $p = 0.04$,
$F_2(1, 29) = 5.99$, $p = 0.02$, $\eta_p^2 = 0.06$, as well as a
significant main effect of the noun probed, $F_1(1, 68) = 12.73$,
$p = 0.001$, $F_2(1, 29) = 19.18$, $p < 0.001$, $\eta_p^2 = 0.16$.
Despite the pattern of means replicating the cross-experiment
interaction seen in Experiments 1A and 1B, there was not a
significant interaction between the number of nouns in the sentence
and the noun being probed, $F_1(1, 68) = 0.28$, $p = 0.60$,
$F_2(1, 29) = 2.64$, $p = 0.12$. Planned pairwise comparisons
revealed a non-significant 46 ms effect of the number of nouns for
the antecedents, $t_1(68) = 1.35$, $p = 0.18$, $t_2(29) = 0.80$,
$p = 0.43$, but the 73 ms effect of the number of nouns for
distractor probes, though not significant by subjects,
$t_1(68) = 1.73$, $p = 0.09$, was significant by items,
$t_2(29) = 3.00$, $p = 0.005$, $d = 0.21$. For the sake of
comparison with Experiments 1A and 1B, in Experiment 2A the slope of
the number of nouns among the referents was 15.4 ms/noun, whereas
the slope of the number of nouns among the distractors was 24.2
ms/noun. These values were 15.2 and 31.3, respectively, in
Experiments 1A and 1B.
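
As a rough arithmetic check on these slopes, the difference between the five-noun and two-noun condition means in Table 5 can be divided by the three-noun span. This is a sketch under the assumption that the reported slopes were estimated per subject, which would explain why the values computed here from the rounded condition means are close to, but not identical with, the reported 15.4 and 24.2 ms/noun. The function name is hypothetical.

```python
def two_point_slope(mean_two_noun, mean_five_noun, low=2, high=5):
    """Change in mean RT (ms) per additional noun between two list lengths."""
    return (mean_five_noun - mean_two_noun) / (high - low)


# Condition means (ms) taken from Table 5, Experiment 2A.
referent = two_point_slope(1553, 1599)
distractor = two_point_slope(1632, 1705)
print(f"referent: {referent:.1f} ms/noun, distractor: {distractor:.1f} ms/noun")
```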

Reference-sentence reading times 

Outliers were first identified as words read for less than 150 ms or 
more than 700 ms; different criteria were used in this experiment 
to try to approximate in a per-word measure the per-character 
measures used in the previous experiments. Outliers among the 
remaining reading times were then identified within each subject 
based on Tukey's (1977) criteria. This resulted in 8.1% of the trials 
being excluded from further analysis.¹
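
The two-stage exclusion can be sketched as below. The fixed cutoffs come from the text; the use of 1.5 times the interquartile range for the fences and the particular quartile method are assumptions, since the text says only that Tukey's (1977) criteria were applied within each subject. The function name is hypothetical.

```python
from statistics import quantiles


def trim_reading_times(times_ms, low=150, high=700, k=1.5):
    """Apply fixed cutoffs, then Tukey-style fences, to one subject's word RTs."""
    # Stage 1: drop words read for less than `low` ms or more than `high` ms.
    kept = [t for t in times_ms if low <= t <= high]
    if len(kept) < 2:
        return kept  # quartiles are undefined for fewer than two values
    # Stage 2: drop values beyond k * IQR from the quartiles (assumed k = 1.5).
    q1, _, q3 = quantiles(kept, n=4, method="inclusive")
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [t for t in kept if lower <= t <= upper]


# Hypothetical reading times (ms) for one subject.
print(trim_reading_times([100, 200, 210, 220, 230, 240, 650, 900]))
```

In this invented example, 100 and 900 fall outside the fixed cutoffs, and 650 survives the cutoffs but falls outside the subject-specific fences.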

The individual-word reading times were combined into three 
regions of three words each. The pre-anaphor region consisted 
of the three words prior to the anaphor; the anaphor region 
consisted of the three-word noun phrase involving the deter- 
miner, adjective, and anaphor (e.g., the cutting tool); and the 
post-anaphor region consisted of the three words following the 
anaphor. Although some items had more than three words prior 
to the anaphor noun phrase, the analysis was restricted to this 
point because there was a dramatic drop in the number of obser- 
vations starting four words prior to the anaphor region. The 
post-anaphor region was always the final three words of the 
anaphor sentence. Thus, each region consisted of three words, 
making their reading times roughly comparable. 
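
The region scoring described above can be illustrated with a short sketch. It assumes that a region's reading time is the sum of its three word times and that the anaphor phrase is located by the index of its determiner; neither detail is stated in the text, and the function name and data are hypothetical.

```python
def region_times(word_rts, anaphor_start, width=3):
    """Sum word RTs (ms) into pre-anaphor, anaphor, and post-anaphor regions.

    anaphor_start is the index of the determiner that opens the anaphor phrase.
    """
    if anaphor_start < width or anaphor_start + 2 * width > len(word_rts):
        raise ValueError("sentence too short for three full regions")
    pre = word_rts[anaphor_start - width:anaphor_start]
    anaphor = word_rts[anaphor_start:anaphor_start + width]
    post = word_rts[anaphor_start + width:anaphor_start + 2 * width]
    return {"pre-anaphor": sum(pre), "anaphor": sum(anaphor), "post-anaphor": sum(post)}


# Hypothetical per-word reading times (ms) for a nine-word reference sentence.
print(region_times([300, 310, 320, 290, 280, 270, 330, 340, 350], anaphor_start=3))
```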

In general, reading time on the reference sentence decreased as
the number of nouns in the list sentence increased (see Figure 3);
this effect occurred most strongly in the pre-anaphor region. A 2
(nouns: 2, 5) x 3 (region: pre-anaphor, anaphor, post-anaphor)
repeated-measures ANOVA revealed a significant main effect
of the number of nouns only in the items analysis,
$F_1(1, 68) = 2.68$, $p = 0.11$, $F_2(1, 29) = 4.66$, $p = 0.04$,
$\eta_p^2 = 0.04$. There was also a significant main effect of
region, $F_1(2, 136) = 29.9$, $p < 0.001$, $F_2(2, 58) = 16.8$,
$p < 0.001$, $\eta_p^2 = 0.31$, but the interaction between the
number of nouns and region was non-significant,
$F_1(2, 136) = 0.65$, $p = 0.53$, $F_2(2, 58) = 1.03$, $p = 0.36$.
Planned pairwise comparisons revealed that subjects read the
pre-anaphor region significantly faster in the five-noun condition
compared to the two-noun condition ($p = 0.02$ by subjects,
$p = 0.05$ by items), but this effect was non-significant in the
anaphor region ($p = 0.26$ by subjects, $p = 0.16$ by items) and the
post-anaphor region ($p = 0.51$ by subjects, $p = 0.21$ by items).

DISCUSSION 

As predicted by the fan-effect hypothesis, and consistent with
Experiments 1A and 1B, probe word accuracy was higher and
responses were made faster in the two-noun condition than in the
five-noun condition for both referents and distractors. Moreover,
the cross-experiment interaction of number of nouns and probe
type was replicated; the fan effect is larger for distractors. The
reading time results replicated those from Experiments 1A and
1B: subjects read the reference sentence faster in the five-noun
condition than in the two-noun condition. However, measuring
reading time per-word enabled a more detailed analysis of
the reference-sentence reading times and revealed that the faster
reading primarily occurred in the pre-anaphor region. Because
this region was identical across conditions and made no reference
to the list sentence, there is no theoretical reason to expect this
difference based on anaphor resolution processes. Instead, these
results support the speeded-reading explanation suggested in the
discussion of Experiment 1A, that subjects may have adopted a
particular strategy in order to mitigate the increased difficulty
of the probe-word task in the five-noun condition by reaching
the probe word task and comprehension questions more quickly.
Furthermore, per-character reading times on the list sentence (see
Appendix B in Supplementary Material) increased as the number
of nouns increased, suggesting that the speeded-reading strategy
was adopted only on the reference sentence after subjects
became aware of the increased difficulty imposed by the longer
lists.

FIGURE 3 | Experiment 2A mean anaphor sentence reading times per
region (pre-anaphor, anaphor, post-anaphor) for the two-noun and
five-noun conditions (error bars indicate SE of the mean).

¹The pattern of results remained similar using a less-strict cutoff
of 1500 ms/word.

EXPERIMENT 2B 

Because subjects appeared to be adjusting their reading speed 
to accommodate the difficulty of representing multiple referents, 
it was important to assess whether the probe word results were 
dependent on this apparent strategy. Experiment 2B was thus 
a replication of Experiment 2A using a fixed-rate presentation 
of the passages. By controlling the pace of reading, any effects 
found on the probe recognition task can be assumed to reflect 
processes that occurred independent of subjects' variable read- 
ing speed. Holding reading-rate constant was not expected to 
change the probe-word results, so it was expected that responses 
to both referents and distractors would be faster and more accu- 
rate when there were two referents in the list sentence than when 
there were five referents. Moreover, this experiment provided one 
more opportunity to examine the prediction that the effect of fan 
would be greater among distractors than among referents. In the 
accuracy data, the fan effect has been reliably much stronger for
distractors than it has been among referents. In the reaction-time
data, between Experiments 1A and 1B, this effect was significant
only in a one-tailed test, and in Experiment 2A, the same pattern
emerged but it was not reliable.

METHOD 
Subjects 

Sixty-six students enrolled in a general psychology course at the 
University of Arkansas participated in the experiment to partially 
fulfill a research requirement. All subjects were native-English 
speakers. 

Materials, design, and procedure 

The materials, design, and procedure were identical to 
Experiment 2A except that the materials were presented at 
a fixed pace of 450 ms per word*. 

RESULTS 

Data exclusion and general analytic considerations 

Based on the data exclusion criteria, the data from 10 subjects 
were excluded from further analysis. Four more subjects were 
removed from further analysis for a systematic misunderstanding 
of the instructions. Therefore, the analyses included 50 subjects 
and 30 items. 

Comprehension 

In general, comprehension (see Table 2) decreased as the number
of nouns increased. A 2 (nouns: 2, 5) x 2 (noun probed: referent,
distractor) repeated-measures ANOVA revealed a main effect
of nouns that was significant in the subject analysis,
$F_1(1, 49) = 4.59$, $p = 0.04$, $\eta_p^2 = 0.09$, but
non-significant in the items analysis, $F_2(1, 29) = 2.53$,
$p = 0.12$. The main effect of noun probed was non-significant,
$F_1(1, 49) = 0.20$, $p = 0.66$, $F_2(1, 29) = 0.23$, $p = 0.63$,
and the interaction between number of nouns and noun probed was also
non-significant, $F_1(1, 49) = 1.50$, $p = 0.23$,
$F_2(1, 29) = 1.31$, $p = 0.26$.

Probe accuracy 

Table 5 presents mean accuracy and probe reaction times as a
function of the number of nouns and the noun probed. In general,
accuracy was higher for referents than for distractors and
when there were two nouns in the list sentence than when there
were five, once again replicating the pattern seen in Experiments
1A, 1B, and 2A. A 2 (nouns: 2, 5) x 2 (noun probed: referent,
distractor) repeated-measures ANOVA revealed a significant main
effect of the number of nouns, $F_1(1, 49) = 53.9$, $p < 0.001$,
$F_2(1, 29) = 27.9$, $p < 0.001$, $\eta_p^2 = 0.52$, as well as a
significant main effect of the noun probed, $F_1(1, 49) = 53.0$,
$p < 0.001$, $F_2(1, 29) = 31.8$, $p < 0.001$, $\eta_p^2 = 0.52$.
Additionally, there was a significant interaction between the number
of nouns in the sentence and the noun being probed,
$F_1(1, 49) = 29.2$, $p < 0.001$, $F_2(1, 29) = 23.1$, $p < 0.001$,
$\eta_p^2 = 0.37$, with a greater 2- vs. 5-noun difference for
distractors than for referents, the third time this pattern has been
replicated.



*Due to limitations of the Linger program, words were presented at a fixed 
rate instead of using a variable rate dependent on the length of each word (cf. 
Gernsbacher, 1989). 



Probe reaction times 

Based on outlier exclusion criteria, 8.7% of the data were excluded
from further analyses. Like the accuracy results, reaction time
tended to be shorter for referents than for distractors and when
there were two nouns in the list sentence than when there were
five. A 2 (nouns: 2, 5) x 2 (noun probed: referent, distractor)
repeated-measures ANOVA showed that there was a significant main
effect of the number of nouns, $F_1(1, 49) = 25.8$, $p < 0.001$,
$F_2(1, 29) = 19.4$, $p < 0.001$, $\eta_p^2 = 0.35$, as well as a
significant main effect of the noun probed, $F_1(1, 49) = 44.1$,
$p < 0.001$, $F_2(1, 29) = 31.9$, $p < 0.001$, $\eta_p^2 = 0.47$.
Once again, the pattern of means replicated the cross-experiment
pattern seen in Experiments 1A and 1B as well as that seen in
Experiment 2A, with the effect of number of nouns being larger for
distractors than for referents. Despite this, there was not a
significant interaction between the number of nouns in the sentence
and the noun being probed, $F_1(1, 49) = 0.70$, $p = 0.41$,
$F_2(1, 29) = 2.03$, $p = 0.17$. The effect of the number of nouns
was significant among the referents, $t_1(49) = 2.86$, $p = 0.006$,
$t_2(29) = 2.54$, $p = 0.02$, as well as among the distractors,
$t_1(49) = 4.55$, $p < 0.001$, $t_2(29) = 4.87$, $p < 0.001$; this
effect was numerically smaller for referents (94 ms, $d = 0.40$)
than for distractors (130 ms, $d = 0.64$). The slopes corresponding
to these effects, 31.2 ms/noun for referents and 43.2 ms/noun for
the distractors, were substantially larger than the respective
slopes seen in the previous experiments, possibly due to the change
to experimenter-paced presentation of the passages.

DISCUSSION 

The results confirmed the predictions of the fan-effect hypoth- 
esis, and the probe-word results were conceptually identical to 
Experiment 2A. Although subjects in the previous experiments 
seemed to be adopting a special strategy of reading the refer- 
ence sentence more quickly when there were more distractors, 
the results of Experiment 2B indicate that this strategy was not 
necessary for the emergence of the probe-word results we had pre- 
viously observed because subjects did not have the opportunity to 
employ it. The replication of the finding that the activation level of 
nouns decreases as the number of distractors increases therefore 
appears to be the result of a diffusion of activation to all potential 
referents. 

However, this conclusion relies on the assumption that subjects
were resolving the anaphor and that the anaphor processing
affected the activation level of the referents. There is some
evidence that anaphor resolution may not always occur
during reading (Greene et al., 1992; Levine et al., 2000; Klin
et al., 2004, 2006; Love and McKoon, 2011), making it possible
that the present results could be occurring independent of
anaphor resolution. The effect of nouns may have been caused
by the increasing memory demands incurred as the number of
referents increased regardless of whether the subjects attempted
to resolve the anaphors. It is possible that as the amount of
information in the subjects' mental representations increased, the
probability of the correct referent being activated, even by the
probe word itself independent of anaphor resolution processes,
decreased, resulting in slower reaction times as the number of
referents increased.

EXPERIMENT 3

Experiment 3 was designed to address the possibility that 
anaphors were not being resolved in the prior experiments. To 
do so, the reference sentence was modified such that it contained 
an anaphor or not (see Table 1), a manipulation that has been 
used many times in the anaphor resolution literature (e.g., Dell 
et al., 1983; Levine et al., 2000). As in Experiments 2A and 2B,
there were either two nouns (i.e., a referent and one distractor) or 
five nouns (i.e., a referent and four distractors) in the list sentence 
that preceded the reference sentence. The referent was used as the 
probe word to provide an index of the activation of this concept at 
the end of the anaphor or no-anaphor sentence. According to the 
fan-effect hypothesis, it is activation from the anaphor as a mem- 
ory cue that is divided among the referent and the distractors that 
is the source of the effect of the number of nouns. Thus, when 
there is an anaphor, the fan-effect hypothesis predicts an effect of 
the number of nouns like that seen in the previous experiments. 
Whatever pattern emerges for the effect of the number of nouns in 
the anaphor condition, because anaphor resolution involves
reactivation of the correct referent (e.g., Dell et al., 1983), there should
be an overall accuracy and reaction time advantage in the anaphor 
over the no-anaphor control condition. 

METHOD 
Subjects 

Seventy students enrolled in a general psychology course at the 
University of Arkansas participated in the experiment to partially 
fulfill a research requirement. All subjects were native-English 
speakers. 

Materials and design 

Experiment 3 used the same set of materials as Experiments 
2A and 2B with the exception that the reference sentence was 
manipulated (see Table 1) such that it included an anaphor (i.e., 
Anaphor condition) or not (i.e., No Anaphor condition), while
equating for length (i.e., the mean length for both the anaphor
and no anaphor conditions was 61.5 characters). Finally, the
probe words were limited to referents only, as in Experiment 1A.
The manipulation of these factors resulted in a 2 (nouns: 2, 5) x 
2 (reference: anaphor, no anaphor) completely within-subjects 
design. 

Procedure 

The procedure of Experiment 3 was identical to that of
Experiments 1A and 1B, except that it included only 98 trials (30
experimental and 68 fillers), as in Experiments 2A and 2B.

RESULTS 

Data exclusion and general analytic considerations 

Based on outlier identification and comprehension and probe 
accuracy, the data from 5 subjects were excluded from further 
analysis. Therefore, the reported analyses included 65 subjects 
and 30 items. 

Comprehension 

In general, comprehension (see Table 2) decreased as the number 
of nouns increased and accuracy was greater in the anaphor
condition than in the no anaphor condition. A 2 (nouns: 2, 5) x 2
(reference: anaphor, no anaphor) repeated measures ANOVA
revealed a non-significant main effect of nouns, F1(1, 64) = 0.62,
p = 0.44, F2(1, 30) = 0.90, p = 0.35, and a significant main effect
of reference, F1(1, 64) = 9.28, p = 0.003, F2(1, 30) = 6.15, p =
0.019, ηp² = 0.13. In addition, there was a significant interaction
between the number of nouns and reference, F1(1, 64) = 4.83, p =
0.032, F2(1, 30) = 5.23, p = 0.029, ηp² = 0.07, with a 7.3% accuracy
advantage for the 2-noun condition compared to the 5-noun
condition in the anaphor condition but only a 1.3% accuracy 
advantage in the no anaphor condition. However, the compre- 
hension questions differed between the anaphor and no anaphor 
conditions, making this the likely cause of the observed effect. 
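
The effect sizes reported alongside these ANOVAs are consistent with partial eta squared, which can be recovered from an F ratio and its degrees of freedom. The sketch below is a convenience check rather than part of the original analysis; the function name is hypothetical, and the inputs are the by-subjects F ratios for the reference effect and the interaction reported above.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# By-subjects F ratios from the comprehension analysis (df = 1, 64).
reference_effect = partial_eta_squared(9.28, 1, 64)
interaction_effect = partial_eta_squared(4.83, 1, 64)

print(round(reference_effect, 2))    # 0.13
print(round(interaction_effect, 2))  # 0.07
```

The same relation can be used to cross-check the effect sizes reported for the probe and reading-time analyses.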

Probe accuracy 

Table 6 presents mean probe word accuracy and reaction times 
along with mean reference-sentence reading times as a function 
of the number of referents and whether the reference sentence 
contained an anaphor. In general, subjects responded more accu- 
rately in the anaphor condition than in the no anaphor condi- 
tion and when there were two nouns in the list sentence than 
when there were five nouns. A 2 (nouns: 2, 5) x 2 (reference: 
anaphor, no anaphor) repeated measures ANOVA revealed a significant
main effect of the number of nouns, F1(1, 64) = 8.43,
p = 0.005, F2(1, 29) = 14.14, p = 0.001, ηp² = 0.12; however, the
simple effect of the number of nouns for the anaphor condition
was not significant, t1(64) = 1.11, p = 0.27, t2(29) = 1.21,
p = 0.24. There was also a significant main effect of reference,
F1(1, 64) = 7.53, p = 0.008, F2(1, 29) = 6.56, p = 0.02, ηp² = 0.11,
but the interaction between the number of nouns and reference
was non-significant, F1(1, 64) = 2.70, p = 0.11, F2(1, 29) = 2.84,
p = 0.10.

Probe reaction times 

Based on outlier exclusion criteria, 7.5% of the data were excluded 
from further analyses. Reaction times (see Table 6) tended to be 
faster in the anaphor condition than in the no anaphor condi- 
tion and when there were two nouns in the list sentence than 
when there were five nouns. A 2 (nouns: 2, 5) x 2 (reference: 
anaphor, no anaphor) repeated measures ANOVA revealed that 
the main effect of nouns was non-significant, F1(1, 64) = 1.26,
p = 0.27, F2(1, 29) = 0.24, p = 0.63. Because of the prediction of
the fan-effect hypothesis, the effect of the number of nouns was
examined for the anaphor condition. The noun-effect was 23 ms
but was also not significant, t1(64) = 1.26, p = 0.21, t2(29) = 0.80,
p = 0.43. The main effect of reference was nearly significant in the
subjects analysis, F1(1, 64) = 3.38, p = 0.07, ηp² = 0.05, but
non-significant in the items analysis, F2(1, 29) = 2.32, p = 0.14, and
the interaction between reference and nouns was not significant,
F1(1, 64) = 0.65, p = 0.42, F2(1, 29) = 1.09, p = 0.31.

Table 6 | Experiment 3 mean probe word responses and per-character
reading times (with standard errors of the mean).

                        Probe word responses              Per-character reading time
                        Accuracy      Reaction time (ms)  List sentence  Reference sentence

Anaphor condition
  Two-noun              0.96 (0.01)   976 (9.9)           70.7 (1.32)    55.7 (0.64)
  Five-noun             0.94 (0.01)   997 (9.9)           73.7 (1.32)    54.0 (0.64)

No anaphor condition
  Two-noun              0.95 (0.01)   1009 (9.9)          -              51.2 (0.64)
  Five-noun             0.89 (0.01)   1013 (9.9)          -              50.1 (0.64)

Reference-sentence reading times 

Based on outlier exclusion criteria, 6.5% of the data were excluded 
from further analyses. In general, reading time (see Table 6) 
was longer when the sentence contained an anaphor than when 
it did not. A 2 (nouns: 2, 5) x 2 (reference: anaphor, no 
anaphor) repeated measures ANOVA revealed a significant main 
effect of reference, F1(1, 64) = 23.29, p < 0.001, F2(1, 29) = 13.01,
p = 0.001, ηp² = 0.27. The main effect of nouns was not quite
significant, F1(1, 64) = 3.08, p = 0.08, F2(1, 29) = 1.56, p = 0.22,
although the pattern observed in Experiments 1A, 1B, and 2A
appeared once again, with shorter reading times when there were
more nouns. There was not a significant interaction between
reference and nouns, F1(1, 64) = 0.23, p = 0.64, F2(1, 29) = 0.55,
p = 0.46.

DISCUSSION 

The results from Experiment 3 provided some evidence that 
subjects were in fact resolving the anaphors when reading the 
passages. Probe accuracy was better after reading a sentence with 
an anaphoric reference than after reading a sentence that did 
not make an anaphoric reference. Additional evidence that sub- 
jects were resolving the anaphors comes from the reading-time 
data. Controlling for length, the reference sentences were read 
more slowly when they contained an anaphor than when they did 
not, consistent with the hypothesis that subjects were engaging 
in additional processing to resolve the anaphor. This conclusion 
is tentative, though, as there were more explicit references* (e.g.,
pronouns, specifiers, definite noun phrases) to entities in the 
prior sentence in the reference sentences in the anaphor con- 
dition (M = 2.8, SD = 0.8) than in the no-anaphor condition 
(M = 1.6, SD = 0.8), t(29) = 6.27, p < 0.001. In most cases (25
of 30 passages), these additional references were not to any of the 
list items; excluding the five passages with a second reference to 
list items does not change the pattern of results for probe accuracy 
or reaction time reported above. 

The results of this experiment's anaphor condition were less 
consistent with the fan-effect hypothesis than the results from 
prior experiments, although the general pattern of degraded 
recognition performance with more nouns persisted; we return 
to this issue in the General Discussion. 

* Associative anaphora (e.g., referring to the test after a sentence mentioning
studying) were not counted.

GENERAL DISCUSSION

Explanations of how anaphoric expressions are understood have
frequently appealed to general memory processes. Consistent
with theories of comprehension that place memory at their center
(e.g., Kintsch, 1988; Myers and O'Brien, 1998; Lewis and
Vasishth, 2005), anaphor resolution is more difficult when fac- 
tors are present that make retrieving a unique item from memory 
more difficult, such as when there is similarity between a desired 
target and some distractor. Prior studies that have produced findings
consistent with this hypothesis (e.g., Corbett and
Chang, 1983; Corbett, 1984; O'Brien, 1987; O'Brien et al., 1990;
Greene et al., 1992; Levine et al., 2000; Badecker and Straub, 2002;
Klin et al., 2004, 2006) have used stimuli with one distractor and
one antecedent, and by a variety of measures anaphor resolution
has been shown to be more difficult because of the distractor. In
five experiments, we examined the hypothesis that a greater number
of distractors would lead to a fan effect (Anderson, 1974) in
anaphor resolution, that is, that each additional distractor
would add difficulty in identifying the correct referent of
the anaphor. We also examined the effect of additional distractors
on the activation of those distractors. Our subjects read pairs of 
sentences, the first of which provided a variably-long list of con- 
cepts from the same taxonomic category and the second of which 
made unambiguous reference to one of the items in the list with 
an adjective-modified definite noun phrase; this was followed by a 
probe recognition task that should provide an index of how active 
the probed concept is in the text representation. 

Collectively, the probe word results from the present exper- 
iments supported the hypothesis that distractors have a cumu- 
lative effect on antecedent activation levels. Although the effect 
of the distractors on reaction time varied in size and significance 
from experiment to experiment, it is overall a robust effect. The 
two- and five-noun conditions with a referent probe were present 
in Experiments 1A, 2A, 2B, and 3. The subject data from these
four experiments were combined and submitted to a 2 (nouns: 2,
5) x 4 (Experiments: 1A, 2A, 2B, 3) mixed-factor ANOVA with
repeated measures on the first factor. The effect of nouns was
significant, F(1, 265) = 15.56, p < 0.001, and the interaction was not,
F(3, 265) = 1.18, p = 0.32, suggesting that there was not significant
variability in the effect of nouns across experiments. Cohen's
d for the effect of nouns was 0.24 (95% confidence interval: 0.12, 
0.36; Smithson, 2003), demonstrating a small but reliable effect. 
Whereas previous research has shown that the presence of a sin- 
gle distractor interferes with the activation of the antecedent, the 
present research extends this finding by demonstrating that each 
additional distractor further reduces the activation level of the 
antecedent and other distractors. This effect is akin to a set size 
effect (Sternberg, 1966), with larger lists leading to longer reaction 
times; however, the difference in the size of the effect for refer- 
ents and distractors suggests that an additional process related to 
anaphor resolution is also occurring. 

The present results are conceptually similar to the fan effect 
where delayed recognition [i.e., the recognition task occurring 
after the presentation of all of the materials as in Anderson 
(1974)] slows as the number of facts associated with a noun 
increases. This effect is generally attributed to the reduction in the 
probability of the correct item in memory being activated at the 
time of retrieval, thus slowing responses. The present experiments 
demonstrate an earlier effect, with the number of distractors 
affecting the activation level of nouns immediately after each 
trial. In this case, the categorical anaphor (e.g., tool) acts as a
retrieval cue, with activation being split among all of the concepts 
associated with the category (i.e., the referent and distractor[s]). 
Increasing the number of distractors should therefore increase 
the time required to resolve the anaphor. This increased retrieval 
time effect was not observed in the present experiments, although 
this was likely due to subjects adopting a speeded-reading strat- 
egy (see the discussion of the reading-time results below). As a 
consequence of multiple potential antecedents, activation should 
be divided among the concepts, limiting the activation for each 
one (see spreading activation theory; Collins and Loftus, 1975; 
Anderson, 1983). This prediction was supported by the slowed 
reaction times and the reduced accuracy resulting from increasing 
the number of distractors. The present results further demon- 
strate that activation does not spread equally to all category 
members when there is disambiguating information (e.g., an 
adjective modifier like cutting in the cutting tool). Increasing the 
number of nouns led to a consistently greater reduction in probe 
accuracy and increase in reaction time for distractors than for 
referents in the combined analysis of Experiments 1A and 1B and in
Experiments 2A and 2B, suggesting that activation was spreading
disproportionately to the referent. 
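
As a purely illustrative sketch of this account, the following toy function divides a fixed amount of cue activation among one referent and its distractors. The proportional-split rule, the function name, and the weight values are assumptions made for illustration; they are not a model fitted to the present data, and the sketch does not attempt to reproduce the larger cost observed for distractors.

```python
def share_activation(n_candidates: int, referent_weight: float = 1.0,
                     total: float = 1.0):
    """Split a fixed amount of cue activation across a referent and distractors.

    Each distractor carries a weight of 1.0. A referent_weight above 1.0
    stands in for disambiguating information (e.g., an adjective modifier)
    that favors the referent. All values are illustrative assumptions.
    """
    if n_candidates < 1:
        raise ValueError("need at least the referent")
    n_distractors = n_candidates - 1
    weight_sum = referent_weight + n_distractors
    referent = total * referent_weight / weight_sum
    distractor = total / weight_sum if n_distractors else 0.0
    return referent, distractor


# Compare the two list lengths used in the experiments.
for n in (2, 5):
    referent, distractor = share_activation(n, referent_weight=2.0)
    print(n, round(referent, 3), round(distractor, 3))
```

Under these assumptions, every candidate receives less activation as the list grows from two to five nouns, while the referent stays above any single distractor.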

We have framed the current results as primarily being an 
effect that occurs at the time of retrieval (i.e., upon reading the 
anaphor). It is possible that these effects are also influenced by 
encoding or storage interference. Upon reading multiple items 
with many shared features, like our list-sentence items, the mental 
representation of these items may be overwritten (Nairne, 1990) 
or degraded due to repeated reactivation by similar items (Estes, 
1997). The methodology used in the current research does not 
allow for delineation between a storage-based and a retrieval- 
based explanation. Ferreting out the relative contributions of 
storage- and retrieval-interference processes would likely require 
careful parametric manipulation of feature overlap among dis- 
tractors and the referent as well as precise control over not only 
timing of reading and probes but also time elapsed between stor- 
age and retrieval, as well as manipulation of serial position of 
distractors and referents. Attempting to work out these details is 
a promising avenue for future research. 

Turning to the reading-time results, we found no evidence that 
additional distractors led to more difficulty processing anaphoric 
reference. By contrast, we consistently found that our subjects 
read faster as there were more distractors. We believe that this 
is the result of subjects adopting a speeded-reading strategy on 
difficult trials (i.e., trials with longer lists of nouns), which coun- 
teracted the predicted increase in anaphor reading time. This is 
similar to Van Dyke and McElree's (2006) finding that, while read- 
ing grammatically-complex sentences, subjects read faster and 
had worse comprehension while holding a memory load (i.e., a 
list of three words) than when not holding a memory load, suggesting
a dual-task strategic trade-off. Our subjects also had lower
comprehension with greater list length (see Table 2), suggest- 
ing that there was possibly a task demand that shifted attention 
somewhat from the comprehension aspect of the task to the mem- 
ory aspect of the task. In no case, however, was comprehension 
lower than about 83%. Moreover, there is no theoretical reason 
to expect anaphor resolution to take less time as the number of
candidate antecedents increases unless subjects were giving up
on trying to identify the correct antecedent (Levine et al., 2000). 
There are a few arguments consistent with the notion that sub- 
jects were in fact resolving the anaphors in the current research. 
First, correctly answering a large majority of the comprehen- 
sion questions required the anaphors to be resolved, which some 
have suggested is necessary to get subjects to resolve anaphors in 
anaphor resolution research (Foertsch and Gernsbacher, 1994). 
Second, some subjects, especially in Experiment 1B, spontaneously
adopted the strategy of labeling distractors as new in the
probe recognition task, which suggests that they had selected the 
referent as the "correct" answer and distractors as the "incor- 
rect" answer to the probe task. Third, Experiment 3 provides 
tentative evidence that subjects were resolving the anaphor, even 
on five-noun trials. Given these arguments and findings, we 
believe that our subjects were resolving anaphors even when it 
was difficult to do so. Therefore, the speeded-reading strategy 
appears to be the most parsimonious explanation of these unex- 
pected results. Furthermore, the fixed-pace presentation of the 
sentence in Experiment 2B prevented subjects from engaging in 
the speeded-reading strategy, demonstrating that the probe word 
effects do not rely on such a strategy. Future research should 
attempt to prevent the speeded-reading strategy while main- 
taining naturalistic reading (e.g., introducing a substantial delay 
between the passages and the probe task or eliminating the probe 
task entirely) in order to better evaluate the anaphor reading time 
hypothesis. 

Finally, returning to the fan-effect hypothesis, the original 
explanation offered for the fan effect by Anderson (1974) was 
based on Anderson and Bower's (1973) theory of memory, which 
assumed that memory retrieval was based on search cues being 
used to identify, in parallel, matching elements in memory, which 
were then serially examined, resulting in an increase in reac- 
tion time with each additional matching element. In the former 
detail (i.e., a parallel matching), this theory is in the same family 
as other global-matching memory theories like those of Ratcliff 
(1978), Gillund and Shiffrin (1984), and Hintzman (1986), upon 
which memory-based text processing frameworks like Myers and 
O'Brien's (1998) resonance model are based. In this sense, the 
results of our experiments are confirmation of both theories of 
memory search and the hypothesis that at least some aspects of 
comprehension may be explained by general memory processes. 
However, other research into the fan effect has shown that there 
are circumstances under which there is no fan effect despite there 
being multiple associations with a single memory cue (Myers 
et al., 1984; Radvansky, 1998; Radvansky et al., 1998). Myers et al.
found no fan effect when memory elements could be integrated 
causally. For example, reading the elements the doctor went to the 
racetrack, the doctor studied the odds, and the doctor made a selec- 
tion may be readily integrated into a causally-coherent narrative 
representation about events occurring at a racetrack. Similarly, 
Radvansky and colleagues showed that the fan effect is reduced or 
even eliminated when potentially-competing memory elements 
can be readily integrated. One feature that makes elements easy to 
integrate is if they can occur at the same time (e.g., the grocer was 
folding a towel; the grocer was clearing his throat), whereas elements
that are in different locations may not be integrated (e.g.,
the welcome mat is in the cocktail lounge; the welcome mat is
in the office building). Radvansky et al. observed a fan-effect in 
recognition of hard-to-integrate elements, but not for easy-to- 
integrate elements. Given that there are boundary conditions for 
the fan-effect in memory experiments, a natural question to ask 
is if there are circumstances under which the search process in 
anaphor resolution might occur without interference. Across sen- 
tences, one such circumstance might be if the items in a list occur 
in more-naturalistic texts, allowing for an integrated situation 
model to be constructed, as suggested by both Myers et al. (1984) 
and Radvansky (1998; Radvansky et al., 1998). By contrast, within
sentences, one condition that has been shown to limit the search 
for referents is when there are strong grammatical constraints on 
reference. Recent evidence from Dillon et al. (2013; see also Chow 
et al., 2014) suggests that syntactic principles may guide retrieval
in a constrained manner for some linguistic dependencies, such as 
reflexives (but see Badecker and Straub, 2002; Kennison, 2003 and 
Sturt, 2003, for further complexities), leading to retrieval with- 
out interference from distractors; syntactic constraints may play 
an especially critical role in directing the retrieval processes that 
occur within a sentence. These types of findings are representative
of two distinct research literatures that have arisen over the past few
decades, one focused on retrieval across sentences, and the other 
focused on retrieval within sentences. Integration of these theo- 
ries and findings holds out the promise of yet further integration 
of theories of memory and comprehension. 